Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory Leak Logs #1159

Merged
merged 15 commits into from
Nov 21, 2024
Merged

Memory Leak Logs #1159

merged 15 commits into from
Nov 21, 2024

Conversation

payalcha
Copy link
Contributor

@payalcha payalcha commented Nov 19, 2024

Reviewed PR - #1155 (Had to close as some unsigned commits were there)

Openfl component change
Add memory leak logs for aggregator and collaborator, if log_memory_usage flag can be set for ubuntu.

Changes -

  1. aggregator.py - Function get_memory_usage added , which collects process as well as system log.
  2. collaborator.py - Function get_memory_usage added , which collects process as well as system log.
    Output -
    agg_mem_details.json - details of aggregator in json format
    _mem_details.json - details of aggregator in json format

Structure -
{
"round_number": 0,
"metric_origin": "collaborator1",
"process_memory": 1155.19,
"virtual_memory": {
},
"swap_memory": {
}
}

Test changes

  1. log_memory_usage flag is added in conftest.py and passed to test when set.
  2. Correction -
    federation_helper.py - _verify_completion_for_participant - added timeout on the basis of num_rounds.

Signed-off-by: Chaurasiya, Payal <[email protected]>
Signed-off-by: Chaurasiya, Payal <[email protected]>
Signed-off-by: Chaurasiya, Payal <[email protected]>
Signed-off-by: Chaurasiya, Payal <[email protected]>
Signed-off-by: Chaurasiya, Payal <[email protected]>
Signed-off-by: Chaurasiya, Payal <[email protected]>
Copy link
Collaborator

@MasterSkepticista MasterSkepticista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@payalcha Per our last discussion, I'm not in favor of this PR for the following reasons:

  1. There is evidence of memory growth from internal experiments, which means we need to identify the source of the growth/leak to disposition it. If the investigation suggests changes to openfl to fix the issue, that is a great PR to have.
  2. In a situation where the leak/growth is due to an underlying framework (pandas, tf, torch etc.), or it falls outside of openfl purview, it needs to be tracked in the respective framework repository.

That said, if the overall decision is to still add this telemetry functionality, I suggest you use decorators over functions that you want to log memory for, within your CI tests. All helper functions can live within tests/. I don't think framework source code (anything within openfl/) is the right place for it, because users are neither required nor encouraged to use it in their usual experiments.

openfl/component/aggregator/aggregator.py Outdated Show resolved Hide resolved
openfl/utilities/logs.py Outdated Show resolved Hide resolved
openfl/component/collaborator/collaborator.py Outdated Show resolved Hide resolved
openfl/utilities/logs.py Outdated Show resolved Hide resolved
openfl/utilities/logs.py Outdated Show resolved Hide resolved
openfl/utilities/logs.py Outdated Show resolved Hide resolved
openfl/utilities/logs.py Outdated Show resolved Hide resolved
Signed-off-by: Chaurasiya, Payal <[email protected]>
@teoparvanov teoparvanov merged commit 073b2f0 into securefederatedai:develop Nov 21, 2024
29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants